arXiv:1508.00427v1 [astro-ph.EP] 3 Aug 2015


Draft version August 4, 2015 

Preprint typeset using style emulateapj v. 5/2/11 


A STATISTICAL SEARCH FOR A POPULATION OF EXO-TROJANS IN THE KEPLER DATASET 

Michael Hippke 

Luiter Straße 21b, 47506 Neukirchen-Vluyn, Germany


Daniel Angerhausen 

NASA Postdoctoral Program Fellow, NASA Goddard Space Flight Center, Greenbelt, MD 20771, USA 


ABSTRACT 

Trojans are small bodies in planetary Lagrangian points. In our solar system, Jupiter has the largest 
number of such companions. Their existence is assumed for exoplanetary systems as well, but none 
has been found so far. We present an analysis by super-stacking ~4 × 10^3 Kepler planets with a total
of ~9 × 10^4 transits, searching for an average Trojan transit dip. Our result gives an upper limit to
the average Trojan transiting area (per planet) corresponding to one body of radius < 460 km at 2σ
confidence. We find a significant Trojan-like signal in a sub-sample for planets with more (or larger) 
Trojans for periods >60 days. Our tentative results can and should be checked with improved data 
from future missions like PLATO 2.0, and can guide planetary formation theories. 



1. INTRODUCTION 

Back in 1771, the mathematician Lagrange found a
solution of the three-body problem for a primary planet
and an asteroid of small mass. When the bodies are in
the same plane in circular orbits of the same period, the
stable locations for the asteroid are 60° from the planet
(Lagrange 1772). As no such asteroids were known
at the time, the problem was considered to be only of 
mathematical interest. Today, we refer to these points 
as the (stable) Lagrangian points L4 and L5 (Figure 1, 
based on Cornish (1998)). 

More than a century later, Max Wolf (1906) of
the University Observatory Heidelberg discovered a new
“planet” 55° east of Jupiter and immediately noted its
strange orbit: “the small change in R.A. is remarkable”^1.
More such bodies were found in the same year, and it was
realized quickly that these are trapped in Jupiter’s Lagrangian
points. To distinguish them from the main belt
asteroids, which usually receive female names, it was decided
to name them after Greek heroes of the Trojan war.
Wolf’s “planet” is today known as (588) Achilles, and is
in the L4 group.

Asteroids that are trapped in L4 or L5 orbit around
their point of equilibrium in a tadpole or horseshoe orbit
(Marzari et al. 2002). Today, > 6,000 Jupiter Trojans
are known^2, as well as a few Neptune-, Mars- and
Earth Trojans. The largest known Trojans have sizes 
> 100km in radius (Fernandez et al. 2003), and it is 
believed that the total number of L4 Jupiter Trojans, 
with radii > 1km, is ~ 6 x 10® (Yoshida & Nakamura 
2005). If L5 contains an equal amount of debris, then the
total transiting area equivalent of small Jupiter Trojans
corresponds to one body of radius ~600 km. The 32
largest objects (Fernandez et al. 2003) account for an
additional radius equivalent of ~300 km.
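
The "radius equivalent" figures above add transiting areas, not radii. As a minimal sketch (the helper name `equivalent_radius_km` is ours and does not appear in the paper), the radius of a single disc with the same total area as a set of bodies is the square root of the sum of the squared radii:

```python
import math

def equivalent_radius_km(radii_km):
    """Radius of one disc whose area equals the summed disc areas."""
    return math.sqrt(sum(r * r for r in radii_km))

# Combine the ~600 km (small Trojans) and ~300 km (32 largest objects)
# equivalents quoted in the text. This is a rough illustration only.
combined = equivalent_radius_km([600.0, 300.0])
print(f"combined equivalent radius: {combined:.0f} km")
```

Under this reading, the two contributions together correspond to a single body of roughly 670 km radius.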

The Lagrangian points are stable over Gyr timescales,
as long as the planet is < 4% of the system mass (Murray
& Holman 1999). Most of the system mass is usually
concentrated in the host star, e.g. 4% of M⊙ is 40 M_Jup,
so that this limit is usually met. Consequently, we might
assume that other planetary systems also possess Trojan
bodies; this is also expected from formation mechanisms
in protoplanetary accretion discs (Laughlin & Chambers
2002). As the properties of extrasolar systems are diverse,
we can ask the question of how large these bodies
can be, and how many there are. There is nothing
that physically prevents them from occurring in larger

^1 In the original German: “Bemerkenswert ist (...) die
kleine RA.-Bewegung von TG”

^2 IAU Minor Planet Center, http://www.minorplanetcenter.net/iau/lists/JupiterTrojans.html,
list retrieved on 26-Apr 2015
numbers (and/or larger sizes) than in our own system.
Hypothetical exo-Trojans have been shown to be stable 
for up to Jupiter mass in the most extreme cases (Erdi 
et al. 2007), assuming low eccentricity (Dvorak et al. 
2004). 

Searching for Trojans in time-series photometry is difficult,
as these bodies librate around their equilibrium
points to a substantial degree. This produces large transit
timing variations, so that they are missed in standard
planet-search algorithms. Also, the mean inclination of
Jupiter Trojans is 10° (Yoshida & Nakamura 2005) to 
14° (Jewitt et al. 2000). If this is typical for exo-Trojans, 
only part of the swarm would go into transit. 

Data from the Kepler space telescope have been
searched for individual Trojans, with a null result and
sensitivity down to ~1 R⊕ (Janson 2013). Another
search was carried out with data from the MOST satellite
for the transiting Hot Jupiter HD 209458b, also with
a null result and an upper limit of ~1 lunar mass of
asteroids (Moldovan et al. 2010).

Although such searches are interesting, we do not repeat
them for individual Trojans here, but ask the question of the
average Trojan effect in all Kepler data. Millions of small
Trojans might show up, on average, when stacking
~4 × 10^3 planets with a total of ~9 × 10^4 transits, as is
the case for exo-moons (Hippke 2015).

2. METHOD 

We employed the largest database available: high-precision
time-series photometry from the Kepler spacecraft,
covering 4 years of observations (Caldwell et al.
2010).

2.1. Data selection 

Based on a list of all validated transiting Kepler planets
(821) and un-validated planet candidates (3,359)
(Wright et al. 2011)^3, we downloaded their Kepler
long-cadence (LC, 30 min) datasets. We used the same
dataset as published by the Transiting Planet Search
(TPS) pipeline, which relies on a systematic error-corrected
flux time series from a “wavelet-based, adaptive
matched filter that characterizes the power spectral
density (PSD) of the background process yielding the
observed light curve and uses this time-variable PSD estimate
to realize a pre-whitening filter and whiten the
light curve” (Borucki et al. 2011). This dataset was
used for most planet validations (e.g., Lissauer et al.
(2012); Rowe et al. (2014)). We have downloaded these
data from the NASA Exoplanet Archive^4 and applied no
further corrections or detrending. It must be assumed
that there are unidentified transits, and stellar trends, in
these data, but we can also assume that these are distributed
randomly over phase time, so that no systematic
effect should affect the precise locations of the L4 and L5
phase time.

2.2. Data processing

Each planet has its own light curve in this dataset,
which comes with companion transits removed (in multiple
systems). We phase-folded every lightcurve with its

^3 http://exoplanets.org, list retrieved on 18-Nov 2014

^4 http://exoplanetarchive.ipac.caltech.edu/docs/API_tce_columns.html,
retrieved on 21-Apr 2015


published period. Afterwards, we re-normalized the data
for each curve, while masking the times around planetary
primary and secondary transit. Then, we re-binned
each phase-folded lightcurve in 1,000 bins. Depending
on the period, this is equivalent to a time of 1 min (for
the shortest period) to 18 hrs (for a 750-day period). For
the median period of 13 days, the bin length is 20 min.
As the average transit duration is a few hours, smearing
occurs only for the few very long period planets.
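
The folding and binning step can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function name, the mid-transit time `t0`, the mask half-width, the use of a per-bin mean, and the convention that the primary transit sits at phase 0.5 (so that L4 and L5 fall near 0.33 and 0.66) are ours.

```python
import numpy as np

def phase_fold_and_bin(time, flux, period, t0, n_bins=1000, mask_half_width=0.02):
    """Fold a light curve on `period` and return the mean flux (ppm) per phase bin.

    Phase is defined so that the primary transit (mid-time t0) falls at 0.5;
    the secondary eclipse then falls at phase 0.0 / 1.0.
    """
    time = np.asarray(time, dtype=float)
    flux = np.asarray(flux, dtype=float)
    phase = ((time - t0) / period + 0.5) % 1.0

    # Mask points close to the primary (0.5) and secondary (0.0 or 1.0) eclipse.
    near_primary = np.abs(phase - 0.5) < mask_half_width
    near_secondary = np.minimum(phase, 1.0 - phase) < mask_half_width
    out_of_eclipse = ~(near_primary | near_secondary)

    # Re-normalize to the out-of-eclipse median and express in ppm.
    baseline = np.median(flux[out_of_eclipse])
    rel_ppm = (flux / baseline - 1.0) * 1e6

    # Average all points that fall into the same phase bin.
    bin_index = np.minimum((phase * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(bin_index, weights=rel_ppm, minlength=n_bins)
    counts = np.bincount(bin_index, minlength=n_bins)
    binned = np.full(n_bins, np.nan)
    filled = counts > 0
    binned[filled] = sums[filled] / counts[filled]
    return binned
```

With 1,000 bins, each bin spans `period / 1000`; for the 13-day median period this is about 19 minutes, in line with the roughly 20 min quoted above.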

2.3. The super-stack 

From the sample of 3,739 useful phase-folded
lightcurves in 1,000 bins, we created a super-stack by co-adding
these, and taking the median of each bin. This
method was also used by Sheets & Deming (2014) for the
detection of average secondary eclipses, and by Hippke
(2015) for the search of an average exo-moon effect. In
contrast to these studies, we did not stretch to the expected
transit duration, because Trojans are expected to
be in orbits around the Lagrange points, and not stationary
in phase time.
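
A minimal sketch of the per-bin median described above, assuming every light curve has already been folded and binned onto the same phase grid (the function name `super_stack` and the use of NaN for empty bins are our assumptions):

```python
import numpy as np

def super_stack(binned_curves):
    """Median over all light curves, bin by bin, ignoring empty (NaN) bins."""
    curves = np.asarray(binned_curves, dtype=float)  # shape: (n_curves, n_bins)
    return np.nanmedian(curves, axis=0)

# Three toy "light curves" with three phase bins each.
demo = super_stack([[1.0, 2.0, np.nan],
                    [3.0, 4.0, 5.0],
                    [5.0, 9.0, 7.0]])
print(demo)
```

The median is less sensitive than the mean to the few very noisy curves, which matters given the outlier problem discussed next.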

The resulting data were strongly dominated by outliers.
This is caused by many factors contributing to
different noise levels: the brightness of the host star,
stellar variability, instrumental differences, and others.
We decided on two filters to remove outliers, namely the
stellar brightness (we kept stars brighter than 15 mag in
J as measured by 2MASS), and the scatter per star (we
kept the better half).
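
The two cuts could be expressed as below. The text does not say whether the "better half" in scatter is taken before or after the brightness cut; this sketch assumes it is taken among the stars that pass the brightness cut, and all names are hypothetical.

```python
import numpy as np

def select_quiet_bright(j_mag, scatter_ppm, j_limit=15.0):
    """Boolean mask: brighter than j_limit and in the quieter half by scatter."""
    j_mag = np.asarray(j_mag, dtype=float)
    scatter_ppm = np.asarray(scatter_ppm, dtype=float)
    bright = j_mag < j_limit  # smaller magnitude means brighter
    # Assumption: the median scatter is computed over the bright stars only.
    threshold = np.median(scatter_ppm[bright])
    return bright & (scatter_ppm <= threshold)

mask = select_quiet_bright([12.0, 14.0, 16.0, 13.0, 14.5],
                           [100.0, 300.0, 50.0, 200.0, 400.0])
print(mask)
```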

It is interesting to mention a (slight) selection effect 
from this filter choice: When rejecting dimmer and/or 
noisier stars, the average stellar radius changes. Smaller 
stars (e.g. M-dwarfs) exhibit more stellar noise (Basri et 
al. 2013) and are usually less luminous. Consequently, 
our sample is shifted towards larger stellar radii. While 
the total Kepler-planet sample has an average stellar radius
of 1.14 R⊙, our post-filter sample has an average of
1.17 R⊙.

3. RESULTS 

The initial post-filtered superstack does not show any
significant Trojan dips, as shown in Figure 2. When
taking the average flux in a bin of 0.03 width in phase
space, we obtain +0.06 ± 0.23 ppm for L4, and −0.10 ±
0.23 ppm for L5. For the average stellar radius of 1.17 R⊙,
we can set an upper limit for the average Trojan area (per
planet) equivalent to one body of radius 460 km at 2σ
confidence. This applies to the full (filtered) Kepler sample.
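
The conversion from a flux limit to a radius is not spelled out in the text. One reading that reproduces the quoted value, offered here only as an assumption, is to combine the L4 and L5 uncertainties (0.23 ppm each) as independent measurements, take twice the combined uncertainty as the limiting depth, and convert it with depth = (R_Trojan / R_star)^2. The solar radius constant and function name are ours.

```python
import math

R_SUN_KM = 695_700.0  # nominal solar radius in km

def radius_from_depth_km(depth_ppm, r_star_rsun):
    """Radius of an opaque disc producing a transit depth of depth_ppm."""
    return r_star_rsun * R_SUN_KM * math.sqrt(depth_ppm * 1e-6)

sigma_each_ppm = 0.23                                # per Lagrangian point
sigma_combined_ppm = sigma_each_ppm / math.sqrt(2)   # assumed: L4 and L5 independent
limit_ppm = 2.0 * sigma_combined_ppm                 # 2-sigma limiting depth
limit_km = radius_from_depth_km(limit_ppm, 1.17)
print(f"2-sigma depth {limit_ppm:.2f} ppm -> radius {limit_km:.0f} km")
```

Under these assumptions the limit comes out within a few km of the 460 km quoted above; using a single point's 2σ (0.46 ppm) instead would give a larger radius, of about 550 km.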

3.1. Cross-check for secondary eclipses 

As a useful cross-check for our data preparation
method, we have searched for the average secondary
eclipse. For simplicity, we have assumed only reflected
light with an average albedo of 0.22 (Sheets & Deming
2014) and neglected differences in temperatures. As
can be seen in Figure 2, there is only a hint of a secondary
feature at phase 1.0 (equivalent to 0.0), which is measured to
be −0.31 ± 0.21 ppm at a bin width of 0.01. Following our
simplified assumptions, we can calculate the expected dip
for this sample as (R_p/a)^2 per planet, giving an average
of −0.88 ppm for the sample. We explain the difference as
caused by smearing from shifts in transit timing (from
non-zero eccentricity) and different transit durations of
each planet, which we did not compensate for.
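
For orientation, the reflected-light estimate can be sketched as below. We assume here that the albedo of 0.22 multiplies the geometric factor (R_p/a)^2; the text quotes only the geometric factor, so this weighting, the constants, and the function name are our assumptions.

```python
R_EARTH_KM = 6371.0
AU_KM = 1.495978707e8

def reflected_eclipse_depth_ppm(r_planet_rearth, a_au, albedo=0.22):
    """Secondary-eclipse depth in ppm from reflected light only."""
    ratio = (r_planet_rearth * R_EARTH_KM) / (a_au * AU_KM)
    return albedo * ratio ** 2 * 1e6

# Hypothetical example: a 2 Earth-radius planet at 0.1 au.
print(f"{reflected_eclipse_depth_ppm(2.0, 0.1):.2f} ppm")
```

The depth falls with the square of the orbital distance, which is why only close-in, large planets contribute noticeably to the stacked secondary eclipse.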
Fig. 2.— The initial superstack shows no significant dips at the
Lagrangian points. 

We have also checked a different sample which is expected
to yield a higher secondary dip: all planets with
radii > 2 R⊕ on orbits < 0.3 au. This sub-sample is expected
to yield an average secondary eclipse of −1.7 ppm;
our data analysis gives −0.73 ± 0.34 ppm in the same bin
width. Again, we have to expect centering variations
which reduce (broaden) the observed depth. It is however
reassuring to see a dip at > 2σ significance. Finally,
we have checked the few individual examples from
Sheets & Deming (2014) where the secondary eclipse is
detected for individual planets (e.g. Kepler-10b); these
dips are also present in our dataset. We conclude that
there seems to be no obvious fault in our dataset, and
that secondary eclipses are hard to detect for most of the
Kepler planets.

3.2. Sub-sample analysis 

The full sample might be heavily diluted by a large 
number of systems with no, or relatively few, Trojans. 
We test this hypothesis by assuming that the flux in L4 
and L5 is uncorrelated for any other cause than Trojan 
bodies. We know from our own solar system that the 
number of bodies in L4 and L5 is approximately equal. 
Then, we can examine a sub-sample of the superstack: 
We take all those planets that exhibit a negative flux at 
L4 (phase 0.33), and take their data of phase 0.5..1 for 
further analysis. The same is done for L5 in the reverse 
logic. This gives us 1,251 samples of negative flux at
phase 0.33, and their light curves for the “right part”, 
i.e. flux phase 0.5..1. We also find 1,266 samples with a 
dip at phase 0.66, and take their part of the light curves 
from phase 0..0.5. Afterwards, we stitch these halves 
together, and obtain 1,940 lightcurves (some have dips in 
both halves). The result of this sub-sample is shown in 
Figure 3 and exhibits a clear dip at both L4 and L5, with 
a maximum depth of 2ppm (970km radius equivalent). It 
is re-assuring to see that this dip is not uniform; as can be 
seen in the double-phase fold (right part of this figure), 
its shape is elongated away from planetary transit, as 
is expected for distributions from horseshoe and tadpole 
orbits. 
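
The "left-right" selection can be sketched as follows. This is a simplified illustration, not the authors' code: it takes the median of the selected left halves and of the selected right halves separately and concatenates them, and the boxcar width of 0.03, the phase grid, and all names are our assumptions.

```python
import numpy as np

def boxcar_mean(curve, center, width=0.03):
    """Mean flux of one binned curve in a phase window around `center`."""
    n = curve.size
    phase = (np.arange(n) + 0.5) / n  # bin centers in phase
    window = np.abs(phase - center) <= width / 2
    return np.nanmean(curve[window])

def left_right_stack(binned_curves, l4=0.33, l5=0.66, width=0.03):
    """Stack phase 0.5..1 of curves with negative L4 flux, and
    phase 0..0.5 of curves with negative L5 flux."""
    curves = np.asarray(binned_curves, dtype=float)
    half = curves.shape[1] // 2
    l4_flux = np.array([boxcar_mean(c, l4, width) for c in curves])
    l5_flux = np.array([boxcar_mean(c, l5, width) for c in curves])
    right = curves[l4_flux < 0, half:]  # selected on L4, read out around L5
    left = curves[l5_flux < 0, :half]   # selected on L5, read out around L4
    return np.concatenate([np.nanmedian(left, axis=0),
                           np.nanmedian(right, axis=0)])
```

The key property is that each half shown in the stack was never used for its own selection, so pure noise at one Lagrangian point should not, by itself, produce a dip at the other.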

An alternative interpretation of the dips in Figure 3
(left) would be numerical fluctuations, caused by autocorrelation.
Indeed, a Durbin-Watson test returns clear
autocorrelation (p = 0.01) if the complete dataset of
1,000 bins is used. However, when we excise the phase
times with signals (around 0, 0.33, 0.5, 0.66 and 1) and
treat only the remaining data, then autocorrelation is
insignificant even at the 10% level.
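
For reference, the Durbin-Watson statistic itself is simple to compute; the sketch below gives only the statistic d, not the p-values quoted above, which require the test's tabulated or simulated distribution. The function name is ours.

```python
import numpy as np

def durbin_watson_statistic(residuals):
    """d = sum((e_t - e_{t-1})^2) / sum(e_t^2); d near 2 means no lag-1 autocorrelation."""
    e = np.asarray(residuals, dtype=float)
    e = e[np.isfinite(e)]  # drop empty bins
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

print(durbin_watson_statistic([1.0, -1.0, 1.0, -1.0]))  # alternating signs: d above 2
print(durbin_watson_statistic([1.0, 1.0, 1.0]))          # perfectly persistent: d = 0
```

Values well below 2 indicate positive autocorrelation between neighboring bins. Note that excising phase ranges and concatenating the remainder creates a few artificial neighbor pairs at the seams.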

We have also cross-checked whether this dip is introduced
by some symmetry artifact. When selecting any
other phase-folded time, e.g. flux < 0 at phase 0.2, no
equivalent dip on the “other side” of the orbit, i.e. at
phase 0.8, can be reproduced. Figure 4 shows this: If an
artifact were present, we would expect a dip centered at
each red mark, which is not present. Clearly, we cannot
produce a similar dip at any phase time with this symmetry
argument; it only works at phase times 0.33 (L4)
and 0.66 (L5). We caution, however, that the S/N of the
total signal is low, as will be explained in the following 
section. Splitting such a weak signal into different views 
can therefore only create weak indicators of its validity. 

3.3. Significance of the result 

To measure the significance of these dips, we take the
signal-to-noise ratio for transits (Jenkins et al. 2002;
Rowe et al. 2014), which compares the depth of the
transit model to the out-of-transit noise:

$$\mathrm{S/N} = \sqrt{N_T}\,\frac{T_{\rm dep}}{\sigma_{\rm OT}} \qquad (1)$$

with $N_T$ as the number of transit observations, $T_{\rm dep}$
the transit depth and $\sigma_{\rm OT}$ the standard deviation of
out-of-transit observations. For the Lagrangian signals, we
find the L4 and L5 dip at S/N ~ 6.7 each, and a combined
S/N = 9.3. It has been argued by Fressin et al.
(2013) that the detection of transits becomes unreliable
for an S/N < 10, so that this signal can only qualify as a
tentative detection.
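
Equation (1) translates directly into code. The sketch below uses hypothetical input values; the function name is ours.

```python
import math

def transit_snr(n_obs, depth, sigma_out):
    """S/N = sqrt(N_T) * T_dep / sigma_OT, with depth and noise in the same units."""
    return math.sqrt(n_obs) * depth / sigma_out

# Hypothetical example: 100 in-dip observations, 2 ppm depth, 1 ppm scatter.
print(transit_snr(100, 2.0, 1.0))

# Quadrature sum of two independent S/N = 6.7 signals, for comparison only.
print(f"{math.hypot(6.7, 6.7):.2f}")
```

A plain quadrature sum of the two rounded values gives about 9.5, close to but not identical with the combined 9.3 quoted above; the text does not state how the combination was done, so the difference may come from rounding or from a different procedure.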

3.4. Sub-sample properties 

We have compared the properties of the 1,940 planets 
in our sub-sample to the total Kepler sample. We use 
a nonparametric density estimation with a local polynomial
regression to include local confidence bands (e.g.,
Ruppert et al. (2003); Takezawa (2006)). 

We find a correlation of the Trojan-like dips with the orbital
period of the planet: at p > 20 days, the probability
density moves towards more pronounced Trojan-like signal,
but the effect becomes only formally significant for
60 < p < 350 days. This might either reflect a stability
dependence of Trojan bodies on their semi-major axis,
or a formation bias, or a mix of both. Due to radiation
effects (such as the Yarkovsky effect (Bottke et al.
2006), which can cause small objects to undergo orbital
changes), we can expect few (if any) close-in (p < 10
days) small asteroids. For long periods, p > 350 days,
the sample size is too small for a significant result.

We have tried the same density estimates for several 
other measures, but all are formally insignificant. At 
first glance, for example, one might assume that the impact
parameter of the planetary transit could positively
affect trojan detectability: For more central planetary 
transits, one might assume a sky-coplanar trojan to also 
(more likely) transit centrally, making detection easier. 
This is, however, likely overpowered by the inclination 
Fig. 3.— Sub-sample superstack in normal (left) and double symmetric (right) phase fold, with expected orbit size shown for reference.
Note different vertical axes. Gray dots are 1,000 bins over phase space, black dots with error bars (right) are 100 bins for better visibility. 



Fig. 4.— Cross-check of sub-sample selection artifacts. In each 
line, we select those data that have a dip on one side of phase 
space, and plot their flux only for the other half of phase space. 
For example, in the first row, all data are shown that have a dip at 
phase 0.034 (at boxcar width 0.03). If an artifact were present, we 
would expect a corresponding dip (red color) at phase 1 − 0.034 =
0.966, which is not seen. Also, we would expect such an artifact
to occur in every line, centered at the black marks, which is not
the case. Instead, we mainly see red dips occur at phase ~ 0.66,
where the Trojan transits are expected. A few columns (0.025 in
phase time) around primary transits are excised as these values
are ~1000× deeper in flux, making them incompatible with useful
color scalings.

scatter of possible trojan “swarms”. If we take the inclination
scatter of Jupiter’s trojans as a proxy, then the
projected size of a trojan cloud in exoplanetary systems 
would be several times larger than the projected size of 
their star. Given our data quality, it is not surprising to 
find no significant correlation with respect to the impact 
parameter. 

The same argument can be made for a correlation to 
the stellar radius. One might hypothesize that the total 
trojan mass correlates to the total mass of the circumstellar
disk, which itself might be correlated with the mass
of the star. More massive stars are known to host more 
massive planets (Johnson et al. 2010), so that Trojan 
bodies might also be more massive (i.e., larger at a given 
Fig. 5.— Density estimate for the Trojan-like signal versus planetary
period. The shaded area is the local 2σ confidence band.
Short periods < 50 days are consistent with zero trojan signal, but 
a trojan-like signal is detected for periods between 60 and 350 days. 
Uncertainty increases for longer periods due to a lack of data. See 
text for discussion. 

density). The scale for such a correlation, however, is
expected to be of order $R_{\rm trojan} \sim M_{\rm trojan}^{1/3}$. The majority
of Kepler planets and candidates are found for stars
between 0.5 R⊙ and 2 R⊙, a range which, in combination
with the limited data quality, does not allow for the detection
of a significant correlation of trojan occurrence
to the stellar radius.

Finally, we also tried correlations to the planetary radius,
multiplicity, metallicity of the host star, and eccentricity;
all of which gave null results.

4. DISCUSSION 

While the data quality from Kepler is the best we have, 
it is only barely sufficient to search for Trojan bodies. 
Still, we believe that the methods outlined in this paper 
will be valuable in the future, when more and better data
become available. The PLATO 2.0 mission (Rauer et al.
2014) will deliver photometry for 500 bn stars in the years 
after 2024, and up to 3x better photometric precision. 
With such a dataset, the analysis performed here should 
be repeated, and should yield highly significant results 
for every breakdown. A few single large Trojans
might also be expected, if the vast dataset (Hippke &
Angerhausen 2015) can be mined sufficiently.

We have explored the potential of PLATO 2.0 using
lower-limit estimates from Rauer et al. (2014) for its
scientific return. Then, ~10× more lightcurves will be
available, when compared to Kepler, for a duration of 6
(instead of 3) years. For simplicity, we neglect the better
instrumental noise properties. This gives a factor of $1/\sqrt{2 \times 10}$
of noise improvement per bin (compared to the current
data), resulting in ~ O.Sppm of noise in each of the 1,000 
bins. Consequently, from a superstack without any selections,
we can expect equal or better signal-to-noise
properties than in the heavily selected and biased Kepler 
sample. More precisely, the full PLATO 2.0 sample is 
expected to yield a signal-to-noise as shown in Figure 3, 
without any of our discerning selection choices. 
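
The noise scaling used above can be checked in a few lines, assuming uncorrelated (white) noise that averages down with the square root of the amount of data. The function name and the 2 ppm starting value in the example are hypothetical.

```python
import math

def noise_scaling(lightcurve_factor, duration_factor):
    """Per-bin noise factor for white noise when the data volume grows."""
    return 1.0 / math.sqrt(lightcurve_factor * duration_factor)

factor = noise_scaling(10, 2)  # 10x more lightcurves, 6 instead of 3 years
print(f"noise factor: {factor:.3f}")
print(f"a hypothetical 2.0 ppm per bin would become {2.0 * factor:.2f} ppm")
```

The factor is about 0.22, i.e. the per-bin noise drops to a bit more than a fifth of its current value before any gain from better instrumental precision.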

Furthermore, we can expect to make clear detections of
large individual Trojans with PLATO 2.0, if such bodies
exist with transiting areas > 0.5 R⊕. To show this, we
have created a series of injections. Our process is similar
to the one described in Hippke & Angerhausen (2015).
In short, we take real solar data from VIRGO/DIARAD
(Fröhlich et al. 1997) and add instrumental noise from
an end-to-end PLATO 2.0 simulator (Zima et al. 2006;
Marcos-Arenal et al. 2014). Into these data, we inject
synthetic Trojan lightcurves, assumed to orbit in
horseshoe-orbits around L4/L5 with semi-major axes of
~0.02 in phase-time (Janson 2013). To explore the parameter
space of recoverable signals, we varied the transiting
Trojan area (radius), the stellar radius, and the
planetary period. We show an exemplary riverplot in
Figure 6 for a Sun-like star, orbited by a 10-day Hot
Jupiter and a 0.8 R⊕ Trojan in L4. Such a configuration
is easily detected in visual examination, but might
escape standard transit detection algorithms due to the
large transit timing variations. We find that for Sun-like
stars, transiting areas of > 0.65 R⊕ (Mars-size) are easily
detected visually; the equivalent limit for a O.bR© M- 
Dwarf is ~0.5 R⊕. Instead of a visual search, algorithms
might be used, as explained by Janson (2013). It is however
unclear how efficient these can be; this would have
to be determined with a series of injections and (blind)
retrievals.

An interesting finding from our own injections is that 
potential Trojans at shorter period planets are much 
more easily identified due to their higher number of transits.
Our example uses a 10-day planet; this is on the
lower end of our hypothesized limit of planets having
Trojans, as explained in section 3.2. For longer period
planets, e.g. a 100-day planet, the number of rows (transits)
in Figure 6 would be 20 (instead of 200) for 6 years
of data. This reduces chances of detection, highlighting 
the benefits of long-term campaigns. 
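
The transit counts follow from the campaign length divided by the orbital period. A minimal sketch, assuming continuous coverage with no gaps and a year of 365.25 days (the function name is ours):

```python
def n_transits(duration_years, period_days):
    """Number of complete orbital periods within the observing campaign."""
    return int(duration_years * 365.25 // period_days)

print(n_transits(6, 10))   # 10-day planet
print(n_transits(6, 100))  # 100-day planet
```

This gives 219 and 21 transits, which the text rounds to about 200 and 20.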

5. CONCLUSION 

With the given dataset, we only find a significant 
Trojan-like signal when applying the “left-right” method, 
selecting only L5 data for those candidates that seem to
exhibit an L4 dip (and vice versa). While we tested this
method to be robust against a symmetrical bias, it also
implies that the main sample must be heavily diluted
with a large number of systems with no (transiting) Trojans.
If the method is valid, then the breakdowns of this
Fig. 6.— Riverplot of Trojan injection into 6 years of real solar
data, assuming PLATO 2.0 instrumental performance. A synthetic
10-day period Hot Jupiter is visible at phase 0.5; the injected
0.8 R⊕ Trojan is clearly visible in its horseshoe-orbit at L4.

sub-sample indicate that Trojans are more prominently 
found for longer (>60 days) periods. These cautious and
preliminary findings might inspire theorists to advance
planetary formation theory in that direction; these theories
can then be validated with the upcoming data from
the PLATO 2.0 spacecraft.